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The  Plane- Wave  Detection  Problem 


William  W.  Symes 

Department  of  Computational  and  Applied  Mathematics 

Rice  University 
Houston,  TX  77005 
U.S.A 


Abstract 

The  plane-wave  detection  problem  is:  to  estimate  the  incidence 
angle  and  waveform  of  a  transient  plane  traveling  wave,  from  samples 
recorded  at  a  linear  array  of  receivers.  This  simple  problem  shares  sev¬ 
eral  important  mathematical  features  with  other  inverse  problems  of 
wave  propagation,  and  is  of  interest  in  its  own  right  as  a  model  prob¬ 
lem  in  ocean  acoustic  signal  analysis.  Straightforward  formulation  as 
a  nonlinear  least  squares  problem  yields  a  nonconvex  objective  for 
which  the  minima  are  not  stably  dependent  on  the  data.  In  contrast, 
an  infeasible  point  formulation,  in  which  the  signal  at  each  receiver  is 
explained  to  some  extent  independently,  proves  to  yield  a  smooth  con¬ 
vex  optimization  problem  with  stable  optima.  Numerical  experiments 
illustrate  the  theoretical  results  about  the  infeasible  point  approach, 
differential  semblance  optmization. 
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0  Introduction 


This  paper  presents  a  detailed  study  of  a  simple  but  nontrivial  nonlinear 
least-squares  problem  in  infinite  dimensions.  We  call  the  problem  studied 
here  the  “(plane-wave)  detection  problem.”  It  is  a  simple  model  for  some 
inverse  problems  in  wave  propagation  previously  studied  by  the  author 
(Symes  [1991a],  [1991c],  Symes  and  Carazzone  [1991],  Symes  [1993]).  Sup¬ 
pose  that  a  scalar  field  in  three-dimensional  space-time  is  sampled  at  every 
point  in  the  interval  [— X ,  X]  on  the  x-axis: 

z(x,  t)  =  U(x,  0,  t)  —  X  <  x  <  X  . 

Suppose  moreover  that  U  is  a  priori  known  to  be  a  plane  wave  moving  at 
speed  1,  except  possibly  for  some  noise: 

U (x,  y,  t)  =  u(t  —  x  sin  9  —  y  cos  9)  . 

The  goal  of  the  plane  wave  detection  problem  (or  detection  problem,  for 
short)  is  to  estimate  the  waveform  u(t)  and  the  incidence  angle  0  (or  equiv¬ 
alently  its  sine  5  =  sin0),  given  the  data  z(x,t). 

Because  one  believes  that  the  noise  in  the  measurement  z  is  small  in  the 
mean-square  sense,  or  for  statistical  reasons  (e.g.  Tarantola  (1987)),  one  may 
naturally  attempt  to  fit  a  prediction  of  2  in  the  mean-square  sense.  If  the 
waveform  is  u(t)  and  the  direction  sine  is  —  s,  one  predicts  the  measurement 

<f>[s,  it]  =  u(t  +  sx)  . 

In  this  notation,  one  attempts  to  find  [s,u]  to  solve  the  output  least-squares 
problem 

min  ||d>[s,  u]  —  z ||2 

5, It  1  " 

where  ||  ||  denotes  the  L 2  norm  on  a  suitable  domain.  This  formulation  may 
be  attacked  numerically  after  suitable  discretization.  Note  that  <f>  is  linear  in 
u  but  quite  nonlinear  in  s. 

Consider  for  a  moment  a  more  general  class  of  problems,  in  which  physical 
theory  connects  a  set  of  model  parameters  {m}  to  a  set  of  data  {z}  through 
a  mapping:  z  —  4>{m)  (in  the  plane  wave  detection  problem,  e.g.  m  ~  [s,u]). 
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Whether  through  solution  of  a  least-squares  problem  or  by  some  other  means, 
one  obtains  an  estimate  m  —  S[z]  of  the  model  from  the  data  z.  In  this  paper, 
we  take  the  point  of  view  that  such  an  estimator  S  is  satisfactory  if  it  has 
the  following  properties: 

(i)  if  z  is  consistent ,  i.e.  2  =  4>[m\  for  m  in  an  a  priori  prescribed 
admissible  set  of  models,  then  m  =  S[z]; 

(ii)  S  is  locally  Lipschitz  continuous,  and  is  well-defined  on  a  neighbor¬ 
hood  of  the  set  of  consistent  data,  in  the  sense  of  suitable  metrics; 

(iii)  S  is  computable  by  means  of  local  (Newton-type)  mathematical 
programming  techniques,  applied  to  some  (constrained  or  uncon¬ 
strained)  optimization  problem. 

Our  main  results  are  that  the  output-least-squares  problem  stated  above 
cannot  produce  an  estimator  with  these  properties,  and  that  a  variant  on 
output-least-squares  does  produce  an  estimator  satisfying  these  conditions. 
We  call  the  variant  the  differential  semblance  optimization  method  —  the 
reasons  for  this  terminology  will  be  evident. 

We  will  shortly  describe  the  results  in  more  detail,  but  first  we  offer  a  few 
comments  concerning  conditions  (i)-(iii). 

Condition  (i)  serves  to  connect  the  estimator  and  the  estimation  problem 
(i.e.  with  the  “physics”).  It  is  certainly  very  stringent  —  for  example,  we 
could  replace  (i)  by  the  requirement  that  a  family  of  estimators  be  given 
which  arbitrarily  well  approximate  a  model  from  its  (consistent)  data  set. 
Such  families  are  produced  by  numerical  methods,  of  course,  and  also  by 
regularization  of  ill-posed  estimation  problems,  for  example  (see  Tarantola 
[1987]).  It  might  even  be  satisfactory  that  a  (linear  or  nonlinear)  projection 
of  the  model  be  reproduced.  We  will  stay  mostly  with  the  strong  version  of 
condition  (i)  for  the  sake  of  simplicity. 

Condition  (ii)  could  be  paraphrased:  “stability  for  low-noise  data  sets.” 
It  guarantees  that  one  is  rewarded  for  efforts  in  the  direction  of 

•  more  accurate  data  collection 

•  more  accurate  basic  physical  modeling. 
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It  is  motivated  by  the  presumption  that  the  theory  is  accurate,  i.e.  that 
models  exist  which  predict  experimental  data  up  to  small  errors  —  small 
in  the  sense  of  the  “suitable  norms”  mentioned  in  the  statement.  Repeated 
experiments  then  yield  necessarily  data  sets  with  small  differences.  Obviously 
this  necessary  “small  scatter”  condition  is  experimentally  verifiable  (for  a 
given  choice  of  norm),  at  least  in  principle.  Condition  (ii)  asserts  that  the 
model  estimates  should  have  differences  of  sizes  proportional  to  the  sizes  of 
the  data  differences.  Thus  the  smaller  the  measurement  errors,  the  more 
unambiguous  the  model  estimate. 

We  shall  have  little  to  say  about  the  “high-noise”  case,  ie.,  when  either  the 
data  tends  to  be  very  inaccurate,  or  the  model  grossly  incomplete,  or  both. 
We  shall  assume  that  the  signal  is  close  to  that  which  would  be  produced  by 
a  plane-wave,  and  ask  that  the  direction  sine  and  wave  form  be  estimated 
with  comparable  accuracy. 

Weaker  notions  of  continuity  could  be  used,  but  it  is  not  clear  that  these 
would  be  as  useful  in  practice  as  Lipschitz  continuity,  which  is  a  qualitatively 
maximal  notion  of  stability. 

Note  that  nothing  is  said  in  conditions  (i)  and  (ii)  about  the  statistical 
nature  of  data  or  estimation  noise  —  only  its  size  is  addressed.  Statistical  as¬ 
sumptions  about  the  data  would  doubtless  entail  consequences  for  the  model 
statistics.  Rather  ambitious  attempts  have  appeared  recently  to  character¬ 
ize  the  solution  of  inverse  problems  in  terms  of  subjective  probabilities,  or 
Bayesian  statistics  (see  especially  Tarantola  [1987],  Menke  [1984]).  We  shall 
not  discuss  this  “standard  inverse  theory”  here,  or  otherwise  offer  much  seri¬ 
ous  argument  about  the  epistemology  of  inverse  problems.  Our  goal  here  is 
modest  and  mathematical:  we  want  merely  to  show  that  some  formulations 
of  our  simple  model  problem  have  properties  (i)-(iii)  above,  whereas  others 
do  not. 

A  part  of  condition  (iii)  is  the  presumption  that  S  is  defined  by  the 
solution  of  an  optimization  problem.  The  rest  of  (iii)  demands  that  this 
optimization  problem,  which  may  incorporate  model  constraints,  be  solved 
by  local  mathematical  programming,  i.e.,  methods  that  find  local  minima. 
Therefore  we  must  regard  the  collection  of  all  local  solutions  of  the  defining 
optimization  problem  as  the  “value”  of  S:  that  is,  S  is  multi-valued  when 
the  defining  optimization  problem  possesses  multiple  local  minima.  In  this 
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case  the  stability  property  (ii)  must  be  reinterpreted.  The  largest  possible 
error  in  the  estimated  model  is  the  radius  of  the  smallest  ball  containing  all 
values  of  S,  i.e.,  all  local  optima.  So  the  stability  property  (ii)  must  mean 
that  the  radius  of  this  ball  is  proportional  to  the  data  error.  In  particular,  for 
noise-free  (consistent)  data,  there  must  exist  exactly  one  local  (hence  global) 
minimizer  (as  is  also  implied  by  (i)  and  (iii)  together). 

Condition  (iii)  is  motivated  by  the  role  of  the  detection  problem  as  a  sim¬ 
ple  relative  of  a  number  of  inverse  problems  in  wave  propagation,  with  which 
it  shares  central  analytical  properties.  For  these  latter  problems,  the  scalar 
parameter  s  is  replaced  by  a  vector  in  a  high- dimensional  space,  and  eval¬ 
uation  of  the  analogue  of  the  model-to-data  map  4>  (and  of  its  derivatives) 
is  very  computation-intensive,  even  with  present-day  supercomputers.  The 
vastly  greater  efficiency  of  smooth  local  optimization  (quasi-Newton)  meth¬ 
ods,  as  compared  to  exhaustive  or  Monte-Carlo  search,  or  to  non-smooth 
(subgradient)  techniques  motivated  (iii):  these  latter  options  are  simply  out 
of  the  question  for  the  more  complex  problems  for  which  plane  wave  detection 
is  a  model. 

Our  first  main  result  (Section  1)  is  that,  for  a  large  class  of  obvious  choices 
of  metric  in  the  model  space  {m  =  [.s,u]},  the  output  least  squares  problem 
stated  above  does  not  define  an  estimator  S  for  the  detection  problem  with 
the  properties  (i)-(iii)  above.  We  choose  the  L 2  topology  for  the  range  of  <j> , 
as  is  implied  in  the  setting  of  the  least  squares  problem,  and  a  subspace  of 
IR  x  Hk[0,T],  k>  0,  as  the  domain.  We  show  that,  depending  on  domain 
metric  (i.e.  on  s),  either  the  objective  function  fails  to  be  differentiable,  or 
its  Hessian  fails  to  be  coercive.  Thus  the  standard  sufficient  conditions  for 
superlinear  convergence  of  Newton’s  method  are  violated,  i.e.  condition  (iii) 
appears  to  be  violated  (Dennis  and  Schnabel  [1983],  Ch.  5).  Moreover,  we 
show  by  explicit  example  that  the  least  squares  functional  can  have  many 
local  minima  even  for  consistent  data,  so  that  condition  (ii)  cannot  possibly 
hold. 

Our  second  main  result  (Section  2)  is  that  a  reformulation  of  the  output 
least  squares,  involving  enlargement  of  the  model  space  and  introduction 
of  new  constraints,  defines  an  estimator  with  properties  (i)-(iii)  above.  To 
motivate  this  reformulation,  note  that  the  output  least-squares  formulation 
demands  that  the  signals  z(x,  •)  be  fit  simultaneously,  for  many  values  of  x , 
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by  the  predicted  signal  u(t  +  sx)  based  on  a  single  input  waveform  u(t).  If 
2  is  at  all  oscillatory,  a  good  fit  is  only  possible  (if  at  all)  when  the  slope 
s  is  very  nearly  correct.  On  the  other  hand,  fitting  the  data  for  a  single 
value  of  x  is  easy  —  in  fact,  any  s  and  u(t)  =  z(x,t  —  sx)  will  do  the  job! 
Equally  well,  it  is  easy  to  fit  z(x,t)  with  the  x-dependent  waveform  u(x,t) 
regardless  of  the  slope  s.  In  allowing  the  waveform  u  to  depend  on  x  we 
have  lost  the  “physics”  of  the  problem,  of  course:  that  is,  the  assumption 
that  the  field  sampled  to  produce  (f>[s,  iz]  is  a  plane  wave.  We  reintroduce  this 
assumption  in  a  measured  way  by  means  of  the  objective  function  (of  s  and 
the  x-dependent  waveform  u(x,t)): 


d_ 

dt 


0[s,u] 


z 


+  <72 


du 

dx 
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where  ||  ||  is  the  L 2  norm  and  u  ranges  over  a  suitable  subspace  of  H1([—X,  X]  x 
[0,T])  (precise  choices  are  identified  in  Section  2.)  We  claim  that  this  ob¬ 
jective  function  yields  an  estimator  satisfying  the  criteria  (i)-(iii),  for  proper 
choice  of  a  >  0  and  minimization  strategy. 

We  draw  the  reader’s  attention  to  the  role  of  the  parameter  a.  It  is  a 
penalty  parameter,  enforcing  a  problem  constraint  in  a  soft  way.  It  is  not 
a  regularization  parameter,  and  does  not  exhibit  the  “L-curve”  influence 
on  the  solution  typical  of  regularization  weights.  Some  information  on  the 
dependence  of  the  solution  on  a  is  given  in  Section  2,  but  much  remains  to 
be  explained. 

Note  that  we  have  implicitly  changed  variables  by  introducing  the  t- 
derivative  in  the  data  misfit  term.  Specifically,  if  ( s,u )  is  a  minimizer  of 
the  above  functional,  then 


1  fx  ,  du 

M"2 x]-xdxTt(x'l) 


is  an  estimator  for  the  plane-wave  detection  problem.  Note  that  if  z(x,t)  = 
z*(x,t)  :=  u+{x  +  s,t)  is  consistent  data  for  u*  6  ^ioc>  then 

s  —  sm  ,u(x,t)  =  [  dt'uAt') 

Jo 

minimizes  the  above  functional  —  in  fact,  its  value  is  =  0.  Moreover,  the 
estimator  just  defined  gives  (s.,  tz»)  as  the  estimated  solution  of  the  detection 
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problem.  Thus  the  new  formulation  is  consistent  —  that  is,  it  has  property 
(i)  above. 


The  demonstration  that  this  estimator  possesses  properties  (ii)  and  (iii) 
is  more  involved.  For  a  variety  of  subspaces  H  C  Hl([—X,X]  x  [0,T]),  the 
problem 


min 
u  eH 


Z 


+  CT2 


du 

dx 


has  a  unique  solution  u[s]  depending  continuously  on  the  data  2.  Set 


Ja[s;z]  =  i 


d_ 

dt 


^>[s,u[s]]  -  2 


+  « 


dx 


In  Section  2  we  show  that  J„  is  smooth  over  s  G  (—1,1)  so  that  its  local 
minima  may  be  found  efficiently  by  variants  of  Newton’s  method.  Next  we 
investigate  the  set  of  critical  points  of  Ja.  Let  e  >  0  be  the  error  level  of  the 
datum  2  relative  to  a  model  [$*,«*],  i.e. 

_  =  /  lilzilfVdJil 
l  INI  J  ‘ 

We  show  that  the  diameter  of  the  set  of  critical  points  of  Ja  is  proportional 
to  e/cr,  provided  that  a  is  sufficiently  small  and  the  “target  signal”  u*  is 
sufficiently  oscillatory.  These  observations  (Theorems  2.9,  2.10)  establish 
properties  (ii)  and  (iii)  for  the  estimator  constructed  by  local  minimization 
of  J^,  applied  to  oscillatory  data.  In  particular,  the  solution  map 

/  r  i  1  fX  £?u[s[2]]  A 

Ml-* 

can  be  computed  reliably  by  smooth  optimization  methods  for  low-noise  data 
sets,  and  is  Lipschitz  continuous  (in  the  sense  of  set- valued  maps).  J p  has  a 
unique  critical  point,  i.e.  the  global  minimizer,  if  e  =  0:  for  consistent  (noise- 
free)  data,  Ja  is  quasiconvex.  Note  that  quasiconvexity  is  not  necessary  in 
general  for  an  estimator  to  have  properties  (i)-(iii). 

The  analysis  of  the  gradient  in  Section  2  yields  a  fairly  precise  description 
of  the  shape  of  Ja  as  a  function  of  a.  For  noise-free  oscillatory  data,  small 
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<r  implies  quasiconvexity.  For  large  a  on  the  other  hand  we  show  that  Ja 
becomes  quite  flat  away  from  the  global  minimizer  (Theorem  2.9(a)),  and  in 
fact  then  closely  resembles  the  output  least  squares  objective  in  its  general 
form,  as  one  would  suspect.  Therefore  to  retain  properties  (i) — (iii)  cr  must 
be  kept  bounded  away  from  both  0  and  oo.  The  estimates  of  Section  2  are 
qualitative  in  nature.  Precision  in  the  constants  appearing  in  these  estimates 
would  have  little  value.  Instead,  an  adaptive  algorithm  for  adjustment  of  a 
is  needed.  We  do  not  give  such  an  algorithm  here. 

The  results  of  Section  2  are  illustrated  by  a  number  of  numerical  experi¬ 
ments  discussed  at  the  end  of  the  section. 

The  term  (T2\\du/ dx\\2  measures  the  extent  to  which  the  waveforms  u( •  ,  x ) 
resemble  each  other  for  nearby  values  of  x.  We  call  this  term  a  differential 
semblance  measure,  and  a  the  differential  semblance  weight. 

As  mentioned  earlier,  the  planewave  detection  problem  is  studied  here 
mainly  as  a  simple  model  for  various  inverse  problems  in  hyperbolic  partial 
differential  equations.  Versions  of  differential  semblance  optimization,  anal¬ 
ogous  to  minimization  of  J„ ,  have  been  applied  to  a  variety  of  such  problems 
by  the  author  and  his  associates;  see  the  references  cited  at  the  beginning  of 
this  section. 

Norms  in  the  T2-based  Sobolov  spaces  Hk  will  be  denoted  by  ||  ||*.  Thus 
II  ||o,  or  simply  ||  ||,  is  the  norm  in  L2.  Domains  are  either  evident  from 
context  or  specified  explicitly.  Boundary  conditions  will  be  introduced  as 
needed  and  specified  by  subscripts. 
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1  Output  Least- Squares  Detection 


We  begin  this  section  by  formulating  the  plane-wave  detection  problem  pre¬ 
cisely  as  a  nonlinear  least-squares  problem.  We  find  immediately  that  in 
order  to  fulfill  the  standard  conditions  for  convergence  of  Newton’s  method, 
we  must  severely  restrict  the  smoothness  of  the  signal  u(t)..  Having  made 
this  restriction,  we  then  find  that,  even  so,  the  least  squares  cost  functional 
has  multiple  local  minima,  which  can  be  as  close  together  as  we  like  —  even 
for  noise-free  data!  As  pointed  out  in  the  introduction,  this  means  that  the 
least  squares  formulation  cannot  be  regarded  as  well-posed,  or  alternatively 
that  the  amount  of  regularization  necessary  to  render  the  problem  well-posed 
is  bounded  away  from  zero,  even  when  the  data  error  vanishes. 

We  shall  assume  that 

(i)  the  wave  form  and  signal  are  square-integrable; 

(ii)  either 

fT 

a)  the  waveform  u  is  periodic  with  period  T,  and  /  dt  u(t )  =  0 


b)  the  support  in  t  of  u(t  +  sx)  is  known  a  priori  to  be  contained 
in  the  interval  [0,  T)  for  any  s  £  [—1, 1]  and  x  6  [— X,  X] 

where  [— X,  X]  is  the  interval  of  recording  positions,  and  [0,T]  is  the  time 
interval  of  the  recording.  Note  that  the  direction  parameter  s  is  necessarily 
in  the  interval  [—1, 1].  Cases  (a)  and  (b)  in  assumption  (ii)  correspond  to 
the  two  most  common  plane  wave  signals:  periodic  and  impulsive.  Under 
our  assumption,  we  can  view  (b)  as  a  special  case  of  (a)  by  extending  u  T— 
periodically,  but  we  prefer  to  single  it  out  to  emphasize  the  more  common 
impulsive  case.  Assumption  (ii)  ensures  that  we  have  sufficient  amount  of 
information  of  the  plane  wave  at  each  recording  position.  The  second  part  of 
assumption  (iia)  implies  that  u  is  oscillatory.  We  will  not  use  this  assumption 
until  the  next  section.  The  signal  resulting  from  the  waveform  u  at  direction 
s  is 

<f>[s,  u](x,  f)  =  u(t  +  sx)  . 
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We  may  regard  ^  as  a  mapping 


<t>  :  [-1, 1]  x  L2[ 0,  T]  -  L2(  [-X,  X }  x  [0,  T] )  . 


The  output  least  squares  formulation  of  the  plane  wave  detection  problem  is: 


given  data 


z6L2([-X,X)x[0,T}) 


find 


(M)6[-i,i]xf2[o,r] 


to  minimize 

J[s,u;z]  =  ||<£(s,u)  -  z\\20  . 

Suppose  that  we  regard  the  minimization  of  J  as  equivalent  to  finding  a  sta¬ 
tionary  point  of  its  gradient,  i.e.  we  ignore  the  possible  presence  of  stationary 
points  other  than  minima.  Formally, 


grad  J[$,  u\  z ]  =  2 D<j>[s,  u]T(<^[s,  u ]  —  z) 


where 

du 

D<f>[s,  u][<$s,  <5u](x,  t)  =  8u{t  +  sx)  +  8sx—(t  -f  sx)  . 

dt 

The  presence  of  a  derivative  on  the  right-hand  side  of  the  last  equation  sug¬ 
gests  that  D<f>  is  not  well-defined  at  general  [s,u]  6  [-1,1]  x  L2[0,T].  In 
fact  it  is  easy  to  show  that,  while  <j>  as  defined  above  is  continuous,  it  is  not 
locally  uniformly  continuous,  hence  a  fortiori  not  differentiable. 

If  we  are  to  employ  smooth  optimization  methods  at  all,  we  must  change 
the  definition  of  <f>,  either  by  strengthening  the  topology  on  the  domain  or 
weakening  the  topology  on  the  range.  We  choose  to  maintain  the  L2  measure 
of  error  on  the  range.  Then  the  signal  u(t )  must  have  a  derivative  in  the  sense 
of  L2  in  order  that  the  derivative  D<j)  of  (j>  exist.  We  shall  state  the  following 
result  in  a  fashion  appropriate  to  the  “impulsive”  case  ((iib)  above).  Similar 
statements  can  be  made  concerning  the  periodic  case  ((iia)  above);  we  leave 
the  formulation  of  these  to  the  reader.  The  proof  of  the  following  result  is 
straightforward: 
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Theorem  1.1  For  A:  =  0,1,2,...,  regard  (f>  as  a  map 

*  :  [-1. 1]  X  ff0‘[ 0,  rj  -  L\  [-X, X]  x  [0,  T\ ) , 

then: 


(a)  <f>  is  of  class  C J  1,1  if  and  only  if  k  >  j . 

( b )  the  gradient  of  J  is  given  by 

gradJ[s,u;z]  =  [ gs,gu ] 


where 

fx  fT  d 

9s  =  2  J  x  doc  J  dt  x—u(t  +  s:r)  ( u(t  +  sx)  —  z(x ,  t)) 
and  gu  is  the  solution  of  the  boundary-value  problem  (in  weak  sense) 

Q^2i9n(t)  =  2  Xu(t)  -  J  ^dx  z(x,t-  sx) 

with  boundary  conditions 

d  dk~x 

9u(t)  ^<7u(0  =  •  •  •  d£k—i 9ui^)  =  0  at  t  =  0,T  . 

We  digress  to  recall  the  standard  conditions  for  local  quadratic  conver¬ 
gence  of  Newton’s  method  for  the  equation  F(x)  =  0  to  a  solution  x »  G  U, 
where  F  :  U  — >  Y ,  U  C  X  open,  X  and  Y  are  Banach  spaces: 

(1)  F  should  be  of  class  CX,X(U,  T),  i.e.  be  continuously  differentiable 
with  Lipschitz-continuous  derivative; 

(2)  the  Jacobian  map  F\x+)  should  be  coercive,  i.e., 


|F'(x,)^a:jy'  >  C\8x\x 

(see  e.g.  Dennis  and  Schnabel  [1983],  Theorem  5.2.1).  We  wish  to  take  gradJ 
for  F,  one  of  the  spaces  [-1, 1]  x  Hk[0,T]  for  X,  and  L2(  [-X,  X]  x  [0,T]  ) 
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for  Y.  For  grad  J  to  be  of  class  C1’1,  it  is  sufficient  that  <f>  is  of  class  C2,1 
which  requires  k  >  3,  according  to  Theorem  1.1.  In  fact  it  is  not  hard  to 
prove  that  k  >  3  is  also  necessary  (at  least  so  long  as  k  is  restricted  to  be  an 
integer).  Accordingly  we  shall  regard  <f>  as  a  map 

^  :  [-1,1]  x  ff|[0,T]  —  L\  {-X,X\  x  [0,T] )  . 

The  Jacobian  of  the  gradient  is  the  Hessian  map.  Its  action  on  a  vector  of 
the  form  [0,  <5u]  is  given  by 


#[s,u][0,<!>u]  =  huu  , 

where  huu  is  the  solution  of  the  boundary-value  problem: 

D-i)‘£4»W  =  2«“  w 

t=0  01 

d  a p 

huu{t)  =  huu(t)  =  —j^  huu(t)  =  0  at  t  =  0,  T  . 

Evidently  H  is  a  smoothing  operator  H~3[0,T]  — >  Hq[0,T].  Thus  the  spec¬ 
trum  of  H  accumulates  at  zero,  and  the  second  sufficient  condition  (coer- 
civity)  fails.  This  implies  that  the  cost  functional  might  not  be  globally 
convex,  thus  multiple  minima  might  be  present.  This  conjecture  is  proved  in 
Symes  [1991b],  and  illustrated  by  the  following  very  simple  example.  In  this 
example  we  view  cf)  once  again  as  a  map 

<t>  ■  [-1,1]  x  L2[0,T]  -  L2([-X,X]  x  [0,T]) 

as  the  choice  of  norm  in  the  domain  does  not  affect  the  properties  illustrated 
here. 

Let  X  =  T  =  7r,  and  z(x,t )  =  sin ujt  be  consistent  and  noise- free  data. 
It  is  easy  to  show  via  normal  equation  that,  given  s,  the  unique  minimizer 

it[s]  =  argminuJ[s,u;  z] 


is 


rx 

u[s](f)  =  /  dx  sin  u(t  —  sx) 

J — x 
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and  straightforward  calculations  lead  to 


7T 

^OptW  :=:  •^[siu[lS]]  =  ~ 

Figure  1.1  and  Figure  1.2  are  graphs  of  Jq  with  u  =  50  and  ui  =  500  respec¬ 
tively.  A  quick  examination  of  J[s]  shows  that  this  function  has  as  many 
local  mimima  as  one  likes,  and  they  can  be  as  close  as  one  likes  by  choosing 
ui  sufficiently  large.  For  any  local  minimizer  s*  of  J[s],  (s*,u[s*j)  is  a  local 
solution  of 

min||^[s,ti]  -  z\\l  . 

Hence,  this  output  least  squares  problem  can  have  as  many  solutions  as  one 
likes,  they  spread  as  wide  as  one  likes,  and  they  can  be  as  close  as  one  likes 
too,  by  just  choosing  u>  large  enough!  Keep  in  mind  that  z  is  smooth  and  of 
class  L 2  for  any  u>,  and  it  is  consistent  and  noise-free!  Therefore  the  solution 
of  the  output  least-squares  problem  cannot  be  stable,  in  the  sense  of  the 
discussion  in  the  introduction. 

This  example  also  reveals  a  typical  feature  of  the  output  least  squares 
detection  problem  for  bandlimited  data:  if  the  frequencies  of  the  data  con¬ 
centrate  near  u ;,  then  the  size  of  the  neighborhood  of  the  global  minimizer 
of  J[s]  in  which  J[s]  is  convex  is  of  O(^).  Outside  of  this  neighborhood,  J[s] 
tends  to  be  flat  (“saturated”),  with  many  local  minima. 

Another  example  of  this  nature  is  the  output  least  squares  detection  of  a 
plane  wave  produced  from  the  Ricker  wavelet  (i.e.  the  second  derivative  of  a 
Gaussian  pulse  —  see  Figure  1.3).  Figure  1.4  is  the  noise-free  sampling  of  the 
plane  wave  on  spatial  range  [—0.5, 0.5]  in  a  time  interval  [0, 1.0],  and  Figure 
1.5  is  the  sampling  with  160%  noise.  There  are  41  receiver  signals  (“traces”) 
in  both  instances,  each  with  512  samples.  The  time  shifts  of  the  sampled 
input  wave  form  are  performed  via  the  Fourier  transform.  All  calculations 
were  done  in  MATLAB. 

Figure  1.6  and  Figure  1.7  are  the  corresponding  numerically  computed 
mean  square  errors  J[s ]  vs.  direction  s.  Both  curves  are  nonconvex  and  very 
sharp  at  their  global  minima  s  =  0.1.  Success  in  minimizing  such  functions 
relies  heavily  on  a  good  initial  guess  or  on  a  global  search.  Note  also  that 
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Figure  1.7  shows  that  with  noise-corrupted  data,  J[s]  tends  to  have  multiple 
minima. 
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2  Differential  Semblance  Optimization 

In  this  section  we  introduce  an  alternative  to  the  output  least-squares  func¬ 
tional,  which  we  have  named  differential  semblance  optimization  (DSO).  We 
develop  the  basic  regularity  and  convexity  properties  of  the  new  objective 
function,  then  present  some  numerical  examples.  We  begin  with  some  re¬ 
marks  intended  to  motivate  the  new  approach. 

As  the  plots  of  residuals  in  the  last  section  suggest,  one  of  the  difficulties 
faced  by  output  least-squares  is  the  necessity  of  fitting  all  of  the  data  traces 
simultaneously  with  a  single  waveform  u  and  slope  s.  On  the  other  hand  the 
data-fitting  problem  for  a  single  trace: 

min  J[u(t  +  sx)  -  z(x ,  t)]2dt 

while  nonconvex  for  the  same  reasons  as  before,  is  nonetheless  trivial  to  solve. 
In  fact,  zero  residual  can  always  be  achieved  simply  by  setting 

u(t)  =  z(x,  t  —  sx) 

and  this  residual  is  independent  of  s,  in  contrast  to  the  multi-trace  case 
considered  in  the  last  section.  Alternatively,  suppose  that  the  models  for  each 
trace  are  regarded  as  independent,  i.e.  u  is  allowed  to  depend  on  x  :  then, 
once  again  zero  residual  is  achieved,  independently  of  s,  through  solution  of 
the  quadratic  problem 

min  J  J  dxdt  |u(i,f  4-  sx)  -  z(x,t)\2 

the  solution  of  which  is 

u(x,  t)  =  z(x,  t  —  sx) 

with  no  restriction  on  s. 

Of  course  the  price  paid  for  this  complete  relaxation  of  the  data-fitting 
problem  is  the  complete  loss  of  the  plane-wave  model,  which  is  the  “physics” 
of  the  plane-wave  detection  problem. 

The  plane-wave  model  is  retrieved  by  imposition  of  any  constraint  forcing 
u  to  be  independent  of  x;  for  example,  minimizing  the  above  function  under 


14 


the  constraint 


is  apparently  equivalent  to  the  output  least-squares  problem  of  Section  1 
and  therefore  will  not  be  amenable  to  solution  through  versions  of  Newton’s 
method  or  its  relatives  adapted  to  equality-constrained  problems,  for  instance 
sequential  quadratic  programming. 

The  following  reasoning  suggests  our  resolution  of  this  difficulty.  We  view 
<f>  as  a  mapping 


<t> :  [-1, 1]  x  L2  ([-X,  X }  x  [0,  T])  —  L2  ([- X ,  X]  x  [0,  T)) 
^[s,  u](x,  t)  =  u(x,  t  +  sx) 

and  solve  the  detection  problem  in  two  steps: 

Step  1.  Solve 

min  ||^[s,  u]  —  z ||2 

to  get  it[s]; 

Step  2.  Solve 

d 

min  —  u 
3  ox 

to  get  6,,  hence  u[s,].  It  is  easy  to  see  that 


u[s](£,  x)  =  z(x,t  —  sx) 


A*] 


dz  dz  2 
dx  S  dt 


and 


/  dz  dz  \ 
>  dx  ?  dt  f 
/  d_z_  dz  \ 
\dt  ’  dt/ 


where  (• ,  •)  is  the  inner  product  on  L2(  [-X,  X]  x  [0,  T] ),  and  we  used  the  fact 
that  <j>[s,  •]  is  an  isometry  on  this  space  under  the  assumption  (iib)  in  Section 
1.  Note  that  Step  1  is  easily  solved  via  the  normal  equation,  which  is  a  linear 
equation,  and  in  Step  2  J[s]  is  perfectly  smooth  and  convex.  The  obvious 
problem  is  that  one  needs  to  compute  the  derivatives  of  z,  which  are  generally 
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not  square  integrable.  In  fact,  evidently  u  (which  must  be  differentiated)  is 
only  as  smooth  as  z,  which  doesn’t  have  a  derivative.  We  resolve  this  conflict 
by  parametrizing  the  waveform  as  du/dt ,  rather  than  as  u,  minimizing  the 
penalized  functional 


1 

d  r 

2 

d 

a 2 

—  <j>[s,u]~z 

+ 

in  two  steps  as  above.  The  change  of  unknown  ensures  that  the  presumption 
that  2  €  f2  will  not  be  violated.  We  will  obtain  a  linear  well-posed  problem 
for  u  (Step  1).  Moreover  the  minimum  (over  u )  is  a  smooth,  globally  convex 
function  of  s  if  2  is  close  to  a  consistent  datum  and  a  is  sufficiently  small.  In 
this  case,  at  the  stage  of  Step  2,  the  first  term  in  the  functional  will  be  small 
and  not  destroy  the  convexity  of  the  second  term.  Note  also  that  in  the  limit 
&  —*  0  this  problem  becomes  constrained  by  the  first  term,  therefore  reduces 
to  the  problem  discussed  above. 

That  is,  for  small  <7,  this  problem  resembles  the  minimization  of  \\du/dx\\l 
subject  to  the  constraint  that  u  predicts  the  data  correctly.  The  x-derivative 
of  u  measures  the  extent  to  which  the  waveforms,  chosen  to  explain  signals 
at  neighboring  receivers,  resemble  each  other.  This  idea  motivates  the  name 
differential  semblance. 

On  the  other  hand,  as  <7  — >  00  the  problem  tends  to  resemble  the  min¬ 
imization  of  the  mean-square  error  subject  to  the  constraint  du/dx  =  0, 
which  is  of  course  exactly  the  output  least-squares  problem  of  the  previ¬ 
ous  section.  Thus  the  differential  semblance  functional  interpolates  between 
the  “pure  semblance”  problem  described  above  (<7  — +  0)  and  output  least 
squares  (<7  — >  00).  This  feature  will  be  illustrated  numerically  at  the  end  of 
the  section. 

We  begin  the  development  of  these  results  by  defining  some  useful  func¬ 
tion  spaces. 

Define  the  extended  model  space  E  by 


H]  =  ju  €  Hl  (  [— X ,  X]  x  [0,  T] )  :  u(x,  0)  =  u(x,  T) 
\x\  <  X  ;  J  dt  J  dx  u  =  0  .| 
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E  =  IR  x  Hi  . 

We  define  a  family  of  norms  on  E  by 

Mil,  =  (Ml2  +  IK.) 

where 


for  x  =  [5,  u\  £  E. 

The  norm  is  definite  for  a  >  0  because  of  Poincare’s  inequality  (Gilbarg 
and  Trudinger  (1983),  p.  164).  Define  the  extended  data  space  F  by 

F  —  F  x  F 


where  as  before  F  =  L2(  [— X,  X ]  x  [0,  T] ),  viewed  as  a  subspace  of  T-periodic 
functions  in  t,  with  the  obvious  norm.  Define  finally  for  a  £  1R+: 


d>„:E—>F 


by 


<j>c[s,u\(x,t) 


du 

—  [x,t  +  sx) 
du 

aYx{z’l) 


For  z  £  F,  let  z  =  [2,0]T  £  F. 

Remark.  The  reason  for  the  definition  of  using  du/dt ,  rather  than  u 
as  in  Section  1,  was  hinted  above  and  will  be  explained  further  below. 

Evidently  <f>„  is  once  again  quite  irregular  (continuous,  but  not  locally 
uniformly).  Motivated  by  the  remarks  at  the  beginning  of  the  section,  define 
nonetheless 

Jafs]  =  inf  |Uff[s,u]  -z II  • 

ueHl  11 
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Theorem  2.1  Ja[s ]  =  <^a[s,  u[s]  ]  —  5||^  where  u[s]  is  the  (weak)  solution 

of  the  boundary  value  problem 

(  d2  2  92  \  dz  , 

\ Kdt2+a  dx2)u(x,t^~  dt^x'1  sx ) 

u(x,0)  =  u(x,T),—  X  <  x  <  X 
dx  ^  =  dx  (X'  ^  S  °  ’  0  -  1  ~  T 


Also,  the  estimate 


holds. 


(X  rT 

/  dx  dt  u  =  0  . 
J-x  Jo 


l“Mlk<r  <  Iloilo 


Proof.  Denote  by  Hb  1  the  dual  space  of  Hb.  For  each  s  6  1R,  define  the 
operator  T„  on  F  by 

T,f{x,t)  =  f(x,t  +  sx) 

(interpreted  as  1-periodic  in  t).  T,  is  bounded  and  continuous  for  each  s  6  IR, 
and  the  following  relations  are  easily  verified: 

Tj f(x,t)  =  f(x,t-  sx) 

TjTs  =  /  . 

Since  the  derivatives  d/dt  and  d/dx  map  H\  — »  F  continuously,  we  can  view 
their  transposes  (formal  adjoints)  as  maps  F  — >  Hf1.  Finally,  define  for  each 
s  €  IR  the  map 

Hi  -  F 


^3,<T^  —  4><t[S1U\  • 

Evidently  $Si<T  is  a  bounded  linear  map,  and 

'  T  —  ' 

$s,ou  =  §*  ,  it  e  Hi 
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Moreover, 


d 


d 


(^J  T°T°dt  +  a 

=  (  92  ,  -2  9 

\dt 2  3x2 


'jT 

dx 


_d_ 

dx 


$Lz  =  ~T.z 


$,cr 


d_ 

dt 


where  we  have  identified  ( d/dt)T  with  —d/dt  as  usual  and  thus  regarded  the 
Laplace  operator  as  a  map:  Hi  — ►  H^1 . 

The  objective  function  of  the  variational  problem  posed  in  the  statement 
of  the  theorem  can  be  written  as 


||$s,<rU  -  z\\p 

=  —  2($Si<tu,  z) p  +  ||z||p 

=  \\^s,<7u\\p  —  2(u,^J  +  \\z\\2p  . 

o  o 


The  first  term  defines  a  coercive  quadratic  form  on  Hi ,  thanks  to  Poincare’s 
inequality.  The  second  term  defines  a  continuous  linear  form  on  Hi.  Accord¬ 
ing  to  a  basic  result  in  the  calculus  of  variations  (the  Lax-Milgram  Theorem, 
see  e.g.  Gilbarg  and  Trudinger  (1983),  p.  83),  the  objective  function  attains 
a  (global)  minimum  at  u  £  Hi  solving  the  normal  equations 

*l**s,«U  =  *Z,Z  . 

The  translation  into  the  boundary  value  problem  stated  in  the  theorem 
follows  the  standard  pattern  of  the  direct  method  (Gilbarg  and  Trudinger 
(1983),  Ch.  8). 


To  prove  the  bound,  suppose  first  that 
satisfies 


d_ 

dx 


(z(x,  t  —  sx)) 


=  0 


2  is  smooth,  T-periodic  in  t,  and 
if  x  =  ±X  . 


Then 


z(x,  t  —  sx)  = 


E 

n,  m  £  Z; 
|n|  +  |m|  >  0 


(  nmx 


cos 


2X 


+ 


rmr\ 
~2~  ) 


2-nint 

e  r 
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the  series  being  absolutely  convergent  together  with  all  derivatives.  Accord¬ 
ingly 

,  (i rmx  rmr\  „_int 

u(x,t)  =  >  umncos - 1 - e  r 

n,m£  Z;  \  2X  ^  2  ) 

Ini  +  Iml  >  0 


2vin/T 

«■(»)’ +(*)’* 


and  in  particular 


.  7rn  7rm 

M  j-i  ^mn|  _  )  [O’  ^mn  |  ^  l^mnl  • 

The  indicated  bound  follows  immediately  from  the  fact  that  TLS  is  an  isom¬ 
etry  of  F.  Note  that  more  generally  for  7  >  0 


\u[s]  ||1)7  <  ^  \\z\\  . 


q.e.d. 


If  we  denote  the  solution  operator  of  the  problem  in  Theorem  2.1  by  N„, 
then  similar  arguments  show  that  is  contractive  on  F,  i.e.  we  have 

Proposition  2.2  —  N^^u  <  ||«||  for  u  £  F. 

Theorem  2.3  J„  €  C°°(IR) 

Proof.  Write  the  objective  function  in  the  statement  of  Theorem  2.1  in 
the  form 

du  2  2  du  2 

T-m~z  F  +  <r  Txf- 

Note  that  Ta  and  d/dt  commute.  Also  Hi  C  F  C  Hjf1  and  Ta  and  its  formal 
adjoint  Tj  both  restrict  to  isomorphisms  of  Hi  interpreted  as  consisting  of 
1-periodic  functions.  Thus 


du  2  du 

T‘m~z  F  +  (r  Tx  F 
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d  2  du 

=  —Tsu-z  +  a2  — 
dt  F  ox 


,  2  ^  rpT 

—  Trr  —  2  +  cr  —T'V 


where  v  =  Tsu  (so  that  u  =  Tj v).  It  follows  that 

Ja[s}=  inf  (  +cr2  Tjv 

vem  dt  „  dx 


d  d 


Since  —  Tls  =  T*  (  ^ - s—  J  and  Tj  is  an  isometry  of  F,  it  follows  that 


veH'b  I  I  dt 


dv  dv  I 


J.W=inf ,  ^-2  +<T2  F- 


dx  dt 


The  problem  just  defined  is  also  a  convex  variational  problem,  as  its  quadratic 
part  is  also  coercive: 


dx  dt 


=  (1  +  s2a2) 


^  1  dv  a 2  dv 

~  2  dt  F+  1  +  2s2a2  dx 


dv  2 

dt  F~ 

2 

+  a2 
F 

dv 

dx 

<T2  II 

dv  |! 

—  2a  s 


dv  dv  \ 
dx  ’  dt) f 


So  in  exactly  the  same  way  as  before  it  attains  its  minimum  at  u[s]  €  Hi , 
given  by  u[s]  =  NSi<7z.  Here  N3 is  the  inverse  operator  of 


d\T  (  d 


dt  \dt 


d  d\  ( d  d 


+  <J  \Tx-sdi)  \d-x-sTt)\v=:M''°veH' 


where  the  transposes  are  taken  in  the  sense  of  F  —*  Hb  1 .  Thus 


J  (j  s  — 


<9u[s]  2 

~^r  ~z  +a 

dt  F 


du[s]  dv 
dx  S  dt 
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The  operator-valued  function 


is  (7  00  (linear!)  in  the  operator  norm,  therefore  so  is  its  adjoint  F  — >  //71, 
whence 

s  M.„  :  Hi  —4  H f1 

is  a  (7°°  operator-valued  function.  Since  its  values  are  invertible,  with  (lo¬ 
cally)  uniformly  bounded  inverse  Ns<a  it  follows  from  the  calculus  that 

s  >—►  Ns,a  :  H f1  — ►  Hi 


is  <7°°.  Therefore  s  — *  u[s]  =  Ns^z  €  Hi  is  C°°,  and  the  statement  of  the 
theorem  follows  immediately.  q.e.d. 

Remark.  In  a  prior  incarnation  of  this  work  (Symes  [1990]),  the  entire 
treatment  was  based  on  the  map 

dv 
dt 

dv  dv\ 

dx  dt) 


which  plays  a  major  role  in  the  proof  of  Theorem  2.3.  There  is  a  rough 
equivalence  between  variational  problems  based  on  $Sj(T  and  those  based  on 
$Si<T  though  only  in  the  case  of  t-periodic  models  does  it  seem  possible  to 
establish  an  exact  relationship. 

Remark.  We  will  sketch  here  another  proof,  which  works  when  the 
definition  of  the  map  $  incorporates  a  cutoff: 


with  g  €  C<j°(lR2).  In  fact,  this  problem  setting  is  more  prototypical  of  the 
seismic  inverse  problems  which  motivate  this  work,  which  require  a  similar 
approach.  It  is  natural  to  take  for  the  model  space  Hl(£l),  where  is  a 
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domain  large  enough  that  data  in  the  support  of  g  can  be  reproduced  with 
models  supported  in  fl,  for  the  range  of  slopes  s  considered.  Evidently  the 
normal  operator  of  $  is  no  longer  elliptic,  and  this  is  also  a  characteristic 
feature  of  the  reflection  inversion  problems.  It  is  necessary  to  regularize  the 
normal  operator;  in  the  present  case,  one  could  for  example  redefine 


$s,c 7, A  = 


A 

°  dx  dt 
x  F  x  F  . 


T 


Now  zero  residual  is  no  longer  possible,  as  usual  for  regularized  problems. 
The  normal  equations  now  read 


'd_ 

dt 


Tj  a2Tq 


d_ 

dt 


2d2  x2  d 

+  <T  S?  +  A 

u  =  0  on  <90  . 


u  — 


If  it  is  arranged  —  as  it  always  can  be  —  that  the  support  of  the  right-hand 
side  is  interior  to  0,  then  the  solution  operator  Ns^\  of  this  Dirichlet  problem 
differs  from  a  certain  pseudo-differential  operator  by  an  infinitely  smoothing 
error.  The  symbol  of  the  pseudodifferential  operator  Ns  a\  can  be  calculated 

d  _ 

explicitly,  and  when  u  =  -N3t<7,\—  Tjgz 

-  ^(z.gT,  —  —  —  Tjgz'j 


Ja[s]  =  -  (z,0,0)] 


FxFxF 


Each  of  the  terms  in  this  sum  involves  operators  of  the  form  T3  (pseudo  D.O.)  Tj. 
It  turns  out  that  such  operators  are  also  (essentially)  pseudodifferential  (this 
observation  is  related  to  Egorov’s  theorem  (Taylor  (1980),  pp.  147  if. ) ;  their 
symbols  can  be  computed  explicitly  and  are  smooth  functions  of  s.  It  follows 
immediately  that  J<r[s]  as  defined  above  is  smooth. 
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For  more  comment  on  this  reasoning  see  Symes  (1991c). 

Next  we  compute  the  gradient  of  J.  We  assume  temporarily  that  the 
data  2  is  smooth.  Then  u[s]  is  smooth  and  periodic,  and 

r  d  d  "I 

22  )  =  X  dt  T,1h  +^a/r-^u[s]. 

So 

—  J[s]  =  —  l|$*,<r«[5]  -  z\\2f 

=  2^— ($s>(Tn[s]),$Si<Tu[s]  -  5^ 

=  2(xf  T-%[s]'T4tu[s]-z) 

+  -  *) ) 

and  the  second  term  in  the  sum  vanishes  by  virtue  of  the  normal  equations. 

Q 

As  noted  before,  —  and  Ts  commute.  Note  also  that  (multiplication  by) 
x  and  Ta  commute,  so  we  obtain 

TsJ[s]  =  2{x  I  “w’  (T*  m)  ( T •  i  “W  - z) )  ■ 

We  will  need  the  following  technical  result. 

Theorem  2.4  Supppose  that  z  is  a  smooth  (C°°)  function,  1-periodic  in  t. 
Then  u[s]  and  all  of  its  s-derivatives  are  smooth  functions,  1-periodic  in  t. 


Proof.  Recall  that  the  r.h.s.  of  the  normal  equation  for  u[s]  is 
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Evidently  if  2  is  smooth  then  this  quantity  is  a  smooth  function  all  s- 
derivatives  of  which  are  also  smooth.  The  first  derivative  with  respect  to 
s  is 


d 

ds 


dz 

dt 


(x,  t  —  sx) 


d2z 


dz 

dt 


and  so  on.  Since  the  r.h.s.  of  the  normal  equations  are  smooth  in  s,  so  is  the 
solution.  The  same  reasoning  applies  to  all  higher  derivatives.  q.e.d. 

Now  suppose  that  2  6  C°°.  The  normal  equations  can  also  be  written 


so 


2d2u 
a  d^~ 


0 


ti  s 


where  we  have  used  the  assumed  smoothness  of  u[s]  to  take  advantage  of  the 
Neumann  boundary  conditions  in  integrating  by  parts. 


It  is  easy  to  see  that  the  operator  x  —  is  antisymmetric  on  smooth  func- 

•  ,  C /  L 

tions  periodic  in  t,  so  the  second  term  above  vanishes  and  we  are  left  with 
the  result 
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Theorem  2.5 


Proof.  We  have  established  this  formula  under  the  assumption  that 
z  e  F  was  actually  smooth.  However,  note  that  u[s]  is  a  continuous  Re¬ 
valued  function  of  z  E  F;  it  follows  that  the  formula  is  correct  in  general, 
q.e.d. 

Remark.  At  this  point  it  is  possible  to  justify  the  inclusion  of  d/dt  in 
the  definition  of  Indeed,  suppose  we  were  to  define  instead 


/  T,u  \ 

<Ms,u]  =  du 

\  a  dx  / 

and  regard  <j)a  as  mapping  //^-functions  of  x,  with  values  in  L2  1-periodic 
functions  of  t,  into  F.  The  normal  equations  may  then  be  written 


1  —  a 


dr 

dx 2 


u  =  T/z 


Exactly  the  same  calculation  leads  to  the  formula  stated  in  Theorem  2.5, 
under  the  assumption  of  smooth  z.  In  particular,  the  f-derivative  of  u  ap¬ 
pears,  because  the  s-derivative  of  u(x,  t  +  sx)  produces  it!  Now,  however,  the 
quadratic  form  in  z,  defined  in  Theorem  2.5,  does  not  extend  to  z  (E  F:  the 
gradient  of  T  is  only  defined  at  a  dense  set  of  z  €  F.  In  fact,  when  d/dt  is  not 
included  in  the  definition  of  (j>G  ,  the  normal  operator  is  not  elliptic,  and  does 
not  smooth  u  sufficiently  to  make  J[s]  smooth  in  s,  for  arbitrary  data  z  €  F. 
The  inclusion  of  d/dt  in  the  definition  of  on  the  other  hand,  induces 
exactly  the  necessary  amount  of  regularity  in  u[s]  to  make  J[s]  smooth. 

Next  we  shall  compute  the  Hessian,  i.e.  second  derivative,  of  J[s].  To 
accomplish  this,  we  need  to  compute  the  derivative  of  u[s],  which  satisfies 
the  normal  equations,  written  in  yet  another  way: 


+ 

dt2  dx2) 
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Therefore 


d 2  2  d2  \ 

dt 2  +a  dxi) 


On  the  other  hand, 


d_ 

dt 


d 2 


d 2 


dt 2  +  *  dx2 


a2 


i _  2  d2 ' 

U2+"  Sx2 


u[5] 


X, 


a2 

<9x2 


8  r  i 

57  “w 


“  “  +  ^  5?)  (*  I  “W)  -  2ff2  5337  “w 


d_ 

dt 


d_ 

dt 


Comparing  equations,  we  see  that 


d2 


d2 


d 


d 


~  [  +  a  7j-J  T.  1  =  ~2ct 


dx2  )  \ds 


dt 


,  d2 

dxdt 


n[s] 


Let  A^.  be  the  inverse  of  -( d2/dt 2  +  a 2d2/dx2)  :  Hi  ->  Hf1 .  Then  we  have 


5J «[.]  =  -x|  uM-  U[S). 

Differentiating  the  result  of  Theorem  2.5,  we  have 
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where  we  have  used  the  antisymmetry  of  x  —  as  before.  Since  both  terms 
define  continuous  quadratic  forms  in  z  €  F,  we  have  proved 

Theorem  2.6 


d 2 


d 


Js2  -  2<j2 


dt 


The  next  results  give  together  a  fairly  precise  characterization  of  the  shape 
of  Ja. 


Corollary  2.7  Suppose  that  u[s]  ^  0,  —  u[s]  =  0.  Then  s  is  a  strict  local 

ox 

minimizer  of  Ja[s\  for  any  o  >  0. 


Proof.  From  Theorem  2.5,  —  J[s]  =  0,  while  from  Theorem  2.6, 

as 


d2 

ds 2 


J[s]  =  2a2 


du[s\n2 

dt  11 


>0, 


28 


since  u[s]  ^  0  and  u[s]  has  mean  zero.  q.e.d. 

The  periodic  1-variable  Sobolev  spaces  H*er  are  defined  to  be  the  com¬ 
pletions  of  C™T\ 0,  T ]  in  the  norms 


HL2  =  E  1  + 


4n27r 

~j^T 


2  _2  \  * 


|uJ2 


Corollary  2.8  Suppose  that  the  datum  z  £  F  is  consistent,  i.e.  for  some 

d  7  J 

s*  6  IR,  u*  G  H\tT)  z  =  —Ts»u*.  Then  s ’  is  a  strict  local  minimizer  of  Ja 


for  all  a  >  0. 


Proof.  Indeed  if  we  set  u(x,t)  =  u*(t)  then  evidently  u  —  u[s*], 
du[s*]/dx  =  0,  and  J[sm]  =  0,  so  the  conclusion  follows  from  the  last  corol- 
larY-  q.e.d. 

It  is  convenient  to  introduce  notation  for  norm  ratios:  for  u  €  (J  H 
define 


per 


-fts.tM  = 


U 


whenever  the  latter  makes  sense. 


Theorem  2.9  Suppose  that  the  datum  z  is  consistent,  i.e.  z  =  j^Ts*u*  for 
s*  £  IR,  u*  £  H\„.  Then  with  R3>t  := 


(a)  If  additionally  u*  £  Hf.[0,T],  then 


l 


z  —  const. 


«L  ,  % 


V 


s  —  s’ 


“  111 


) 


f  D2  R\ 

<  Jo[s\  <  I  \\z\\  +  const.  <  - - +  -5’1 


\S  -  s* 


«  1 


(b) 


dJa 


ds 


s]|  >  o'2  |s 


f  8  i 

—  -5*|  \\u\\l  |—  —  const. RqA  —  const. y/a  R±x  l 
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(c)  If  \fo  Ri  t  and  Rq  i  are  suffiently  small,  then  is  quasiconvex  over 
[-1-1]- 2’ 


Remark.  Before  giving  the  proof  of  Theorem  2.9,  we  discuss  its  meaning. 
The  norm  ratios  Rsj  measure  oscillation.  For  example 


'  .  nnt' 

rnr 

sin  — — 

zz 

T  \ 

T 

This  gives  us  a  way  to  interpret  the  results  in  terms  of  the  shortest  and 
longest  significant  wavelengths  in  an  essentially  bandlimited  target  signal  um . 

The  bound  (a)  means:  |Ja[s]  —  ||z||2|  is  small,  provided  that  a  is  much 
larger  than  the  largest  frequency  (upper  bandlimit),  and  that  |s  —  s*|  is  larger 
than  the  longest  wavelength  (corresponding  to  the  lower  bandlimit).  That  is, 
as  a  — ►  oo,  Ja  becomes  flat  if  the  slope  s  is  “more  than  a  wavelength  in  error”. 
This  is  exactly  the  behaviour  observed  in  Section  1  for  the  mean  square  error 
function,  and  justifies  the  intuitive  notion  that  Ja  must  approximate  the 
mean  square  error  for  large  a. 

The  bound  (b)  states  that  Ja  has  only  one  critical  point,  so  long  as  the 
longest  wavelength  in  u *  is  sufficiently  short.  Moreover  the  yfcr  R\  x  term 
suggests  that  larger  a  can  be  allowed  for  shorter  wavelengths.  Otherwise 
put:  J„  is  quasiconvex  so  long  as  cr  ft  1  and  um  is  sufficiently  oscillatory. 

Proof.  Assume  for  the  time  being  that  u*,  hence  u[s],  is  smooth.  The 
result  as  stated  will  follow  from  continuity. 

As  in  the  proof  of  Theorem  2.1, 


[s](x,f)  = 


OO 

E 

k  >  0 


3  i  i 
Ftuis] 


COS 


-1  k,n 


2irtnt 

e  r 


so 


d 


—  u[s](±X,t)=  ]T  (±1)* 

ot  k  >  0 


d_ 

dt 


u[s] 


2nint 

e~T^~ 


Here  for  n^O, 
d 


dt 


“M 


kn 


1  rX  ,r 

- — -  /  dx  dt 

+  Oo k)  J-X  Jo 


XT{  1  +  80k) 


cos 


J  k,n 


kirx  kit 


2X 


d 


+  —  }  e  ~^~Qt 
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Of  course 


Akn 


Evidently, 


d 


d 2 


dt  N°  [ T-°di2  T°'U* 


J  kn 
-1 


27 rin  ( n2n2  0  k2n2\  4n2n2 


J-2 


+  V 


4X2 


jt2 


>Un«n 


27 rin 


1  +  <r2 


T2  P 


16X2  n2 


-l 


AknK 


dt 


=  0,  so  the  sum  on  n  implicitly  avoids  u  =  0. 


fco 


^/_%x{eXp[.v( 


—  s)  k  \  kn 

~T  +2x)  X  +  T 


+  exp 


2n(s*  —  s)  k  \  kn 

in  I - 1  x - 

T  2X  2 


ik  .  (  (2 n(s*-s)X  P 
2  51  no  H - T - +2, 


i  k  .  (  ( 2 n(s*  —  k 

H — —smc  n  - — - 1 — 

2  \  V  T  2 


<  1  uniformlv.  Thus 


oo  oo 

E  E 

n  =  -oo  k=0 
n  ^  0 


d_ 

dt 


kn 


,  ^  tav  2“  c  2  r  p 

s  E  -75-  Kl  E  1  +  * 

n  =  -00  -1  k=0  \ 

n  ^  0 


16X2  n2 


-2 


MU’ 


<  E 


n  =  —00 
n  ^  0 


4n27r2 


i«;i2  ia 


On 
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+ 


const.  ,22,  n37r3,,„,o 

-r-  E  ~f5~  Kl 

u  n  =  —  oo 

n  ^  0 


by  the  simple  domination  of  a  sum  over  k  by  an  integral.  On  the  other  hand 

1 


|  Jd()n  |  5: 

so  we  conclude  that  for  s  7^  s* 


n|s  —  5*|  ’ 


n  /  0 


d  ,  , 


<  const. 


11“  llo 

\s  —  5*|' 


Ml 


+ 


<  const.  ||  w*  ||i 


K,  ,  Rh 


5  —  5* 


+ 


Similarly 
<9u[s] 


(*>0  =  ]£ 


’  ^  2X  V  +  ^  4 X2  n2)  k 

{(  Jc7T  X  JcTV  \  1 

sin  1  -^r  +  —  J  >  is  also  orthogonal  on  [— X,  X] 


( kirx  kir 
sin  I  +  — 


V  2X 


so 


du[s] 


dx 


k2  7T 


-  E  ki2E^(i+-22W2t)  I-4*" 


h= 0 


T 2  fc2' 


4X2  n2 


const. 


<  — — E—  1 1 1Z*  II  3  =  — ~S^"  E>2 


by  another  simple  integral  majorization.  The  first  part  of  the  theorem  follows 
immediately  via  the  triangle  inequality. 

Since  T z*  is  a  function  of  t  +  ( s *  —  s)x  it  follows  that 

0  =  “  {§r* + a?)  (I;  - (s’  -  B)§i)  “w  ■ 
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Since  u[s]  obeys  the  homogeneous  Neumann  condition  at  x  =  ±X,  we  can 
view 

”:=  {§i  ~ {s' ~ 3)  m) uW 

as  the  solution  of  the  boundary  value  problem 

/  d2  d2  \ 

~  (ae  +  ^  ”  = 0  in  [-- v.  -v]  x  [o,  t] 

v(±X,  ■)  =  (s'  ( ±x ,  •) 

v  is  T-periodic  in  t  . 

Straightforward  estimation  shows  that 


lull2  <Ca(sm  —  sf 


<9ti[s] 

dt 

x=±X 

Using  the  Fourier  expansion  developed  previously,  we  can  write 


^(±V,<)  =  t  £ 


n=-oo  k>0 


dt 


J  kn 


2irint  I  kxx  kn 

cos - 1 - 

2X  '  2 


x-±X 


so 


OO 


=  £ 


27rtnt 


e  t  £(T1)* 

n=— oo  fc>0 


dt 


kn 


<9u[s] 


dt 


(±X,  •) 


~  4nV  j 


W(1  +  'i 


=  E 


00  n2  7T2 


rp2 


E  {Ti)k  1  +  <t2 


T2  k2' 


-1 


fc=— c 


16X2  n2 


SinC  7T 


2n(s“  —  s)A"  k 

rn  rv 
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For  each  n  £  Z,  denote  by  &[n  the  unique  k  6  Z  satisfying 


k  + 


4n(s  —  ,s*)A 


< 


Split  the  sum  over  k  above  into  three  pieces,  according  to  whether  k  —  &[n] 
is  zero,  odd,  or  even: 


E  (T*?  1  +  <t2 


T2  k2 


k=-c 


16A2  n2 


-l 


Sine  7T 


2n(s*-s)A  k' 
T  +  2 


T 7  %]2\-‘  .  ..  . 

7  —  )  smc  1  AM 


sin7rA[n]  ^2, 

k  =  — oo 
k  ^  0 


7T 


1  +  (%l+ih!i  (A[„]  +2k+  l)'1 


16  A' 2 


n* 


(i  +  ^2 


T 2 

16  A2 


(fc[n]  +  4k  +  2)2\ 

n2  y 


(A[n]  +  2k  -f-  1)  1 


±  t  sin  7r(A[n]  +  |)  g 


7T 


k=—oo 


1  +  cr2 


r2  (%]  +  4fc  +  l)2' 


-l 


16A2 


(A[n]  +  2k  +  -) 


-  1  +  <j2 


T 2  (*[n]  +  +  3)2 


16  A2 


n* 


(A[n]  +  2k  + 


Here 

A[n]  =  ~  ® 

L  J  T  ~  2 

(so  |A[n]|  <  all  n),  and  the  sums  have  been  “unrolled”  a  bit  take  advantage 
of  the  periodicity  of  sin. 

One  easily  sees  that 


k[n]T 
4  An 


+  (s  -  s’) 


T 

8  An 


whence  follows  a  uniform  bound  on  the  first  term. 
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to  |  w 


Set 


„  ,  A  ,  cr2T2  (k[n]  +  2x)2Y1 

/W  =  +  TiX*  - j  W« 


]  +  *)  1  • 


Then  the  second  sum  can  be  written 


E  /(2*)  -  /(2t  +  1) 

k  =  —  co 
k  #  0 


and  the  third  as 

£/(2*+i)-/(2*  +  f). 

The  arguments  of  /  in  the  sums  avoid  the  interval  (— |).  Outside  of  this 
interval 

1/0*01  <  4  • 

In  addition  to  the  pole  at  x  =  — A[n],  /  has  at  most  two  critical  points,  so  the 
two  half-axes  (— oo,  4]  and  [4,  oo)  are  divided  into  at  most  four  subintervals  on 
each  of  which  f(x)  is  monotone.  Thus  each  sum  can  be  grouped  into  at  most 
four  subsums,  each  of  which  is  an  alternating  sum  of  a  monotone  sequence. 
Each  of  these  is  estimated  by  the  following  simple  principle:  Suppose  {an} 
is  a  monotonic  sequence  with  a_  <  an  <  a+,  all  n.  Then 


OO 

T.  °2n+l  —  <*271+2 
71=0 


<  a+  —  a_  . 


Consequently 


E 

k  =  — oo 

k  #  0 


f(2k)-f(2k  +  l) 


<  32 


and  a  similar  bound  holds  for  the  third  term.  In  view  of  the  previous  esti¬ 
mates  of  k[n],  A[n]  this  gives  a  uniform  bound  on  the  sums.  We  obtain 


<9u[s] 

dt 


(±X, 


<  const. ||u*||j  . 


35 


Similarly, 


whence 


du[, s]  ^ 

— (±A,  •)  <  const. ||u*||i 

Ot  ,  a 


dJ°  _  o_2 /du[s\  du[.s]\ 

ds  1  J  \  dt  '  dx  / 

=  -2 a\s  -  s’)  2 

’  dt 


5u[s]  <9u[s]  dub 

^T'ir-(s-s)^r 


£w|  * 

—  const.cr*  |s  —  3*|  ||u* jjj  ||u*||i 

From  the  Fourier  expansion  of  — ■  ■  derived  before, 

at 

^[fl2=l  f.  4n27T2  2 

dt  2  ni^oa  F2  |Unl 

n  ^  0 

Ju  h  ('  (Aw  +  9)  f  (‘ +  **  iSr 
-  5  Jl  *-¥-  |u”|!|5i"c  1  A>n>i2  (> + ^  t^M) 

simply  by  dropping  all  terms  with  k  ^  0.  From  the  definition  of  k[n], 

MMI  =  +  <l 


|sinc  7rA[n]|  > 
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where  the  O(^)  bound  is  uniform  over  any  finite  interval  of  a.  Therefore 
<9u[s]  2  ^  4  du*  2  ||  l|2 

sr  -  n  nr  - const- »“»« 

£  ~  const.ftj.,)  ||u'||J  . 

These  estimates  together  establish  the  second  part  of  the  theorem.  q.e.d. 

The  next  theorem  establishes  properties  (ii)  and  (iii)  of  an  acceptable 
estimator  for  the  output  of  a  local  optimization  algorithm  applied  to  Ja,  as 
discussed  in  the  introduction. 

Theorem  2.10  Suppose  that  0  <  a  and  that  um  6  H^er  is  sufficiently  oscil¬ 
latory  relative  to  a,  as  in  Theorem  2.9  (b),  (c).  Let  z  €  F and 


be  the  relative  error  level,  where  z *  =  0[s*,u*].  If  s  is  a  critical  point  of  J„, 
then 

...  e 

|s  —  s  |  <  const.—  . 

(T 

Proof.  Recall  that  u[s]  is  the  solution  of 

f  +  ”  dx2 ) "  at  T~"z 

plus  boundary  conditions.  Denote  by  Uo[s]  the  solution  of  the  same  problem 
with  2  replaced  by  2*.  From  Theorem  2.1, 

IMfl  -  “oMIIj,, 


z^(u[s]  -  u0[s])  +<T2  J-(u[s]  _  u0[s]) 


<  I \z  —  2*11  . 
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Thus 


d  t  r_i  _2/du[s ]  0ti[s]\ 

=  \~aF’~dT ) 


5u0[s]  #u[s]  du0[-s]\ 

dt  dx  dx  ) 


au0[s]  dti0[s]  \  ) 
dt  dx  / J 


Denote  by  J*[s]  the  functional  J„  with  data  2*.  Then  using  Theorem  2.1 
again, 


<  2cr\\z  -  2*||(||.z||  +  ||z*||)  <  const. ae\ 


u 


As  noted  before,  if  u*  is  sufficiently  oscillatory,  the  quantity  in  Theorem 
2.9(b)  obeys 


—  —  const. TZ0  i  —  const. \Za7Zi ,  :=  const.  >  0  . 

7 r2  ’  2’1 

Combining  Theorem  2.9(b)  with  the  above  inequalities  we  obtain 

>  const.  (cr2|s  —  s*|  —  cr  e)  ||u*||2  , 

so  if  s  is  a  critical  point  of  JCT,  then 

|s  —  sm |  <  const.—  . 

(7 

q.e.d. 

Up  to  this  point  we  have  concentrated  entirely  on  the  extent  to  which  s 
can  be  estimated.  Concerning  the  waveform  u,  we  have  the  following  result, 
the  proof  of  which  is  straightforward: 
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Corollary  2.11  Let  the  hypotheses  be  as  in  Theorem  2.10  and  suppose  that 
s  =  argmin  Ja  [5] .  Then  the  estimator 


m  =  hl-xixn>r(x’t) 


satisfies 


[T  dt  f  dt'  (u'{t')-U{t')) 
Jo  Jo 


<  const.—  . 

<7 


Remark.  That  is,  the  estimate  of  u  is  Lipschitz  continuous  with  loss  of 
one  derivative.  It  is  easy  to  see  that  this  result  is  sharp. 

We  have  illustrated  the  foregoing  results  through  numerical  experiments. 
We  used  the  data  displayed  in  Figures  1.4  and  1.5,  which  led  to  the  opti¬ 
mal  mean-square  error  curves  in  Figures  1.6  and  1.7.  Figure  2.1  shows  the 
differential  semblance  objective  function  Ja  for  the  data  of  Figure  1.4,  using 
the  weight  a  =  1.  This  result  is  directly  comparable  to  that  of  Figure  1.6, 
and  the  contrast  in  Figure  2.1  is  essentially  quadratic.  Figure  2.2  displays 
the  analogous  result  for  the  noisy  data  of  Figure  1.5,  using  <7=1.  While  the 
values  of  the  objective  function  are  very  much  greater  (by  an  order  of  mag¬ 
nitude),  and  the  curve  much  shallower,  the  general  shape  remains  quadratic 
with  the  correct  minimum. 

One  might  expect  a  larger  error  in  the  estimate  of  s  on  the  basis  of 
Theorem  2.10.  Apparently  the  completely  incoherent  nature  of  the  data  noise 
in  Figure  1.5  leads  to  anomalously  small  effect  on  the  differential  semblance 
estimator.  To  show  that  coherent  noise  has  a  much  stronger  effect,  we  use  the 
data  in  Figure  2.3  (the  sum  of  Figure  1.4  and  its  mirror  image)  to  produce  the 
plot  of  J fj  in  Figure  2.4.  The  curve  is  still  quite  convex,  but  the  minimizer  has 
moved  to  s  =  0  (unsurprisingly).  While  these  examples  are  suggestive,  the 
precise  influence  of  the  nature  of  noise  on  the  differential  semblance  estimator 
remains  to  be  explained. 

Our  final  examples  illustrate  the  conclusion  embodied  in  Theorem  2.9(a), 
that  as  <7  — >  00  the  differential  semblance  objective  approximates  the  output 
least  squares  objective,  and  in  particular  tends  to  become  saturated  away 
from  the  global  minimizer.  Figures  2.5  and  2.6  show  the  function  Ja  for 
<7  =  10,  using  the  data  of  Figures  1.4  and  1.5  respectively.  The  minimum 
is  more  sharply  defined,  as  is  suggested  by  Theorems  2.9  and  2.10,  and  the 
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curves  flatten  out  away  from  the  global  minimizer.  Even  in  this  example, 
though,  Ja  remains  quasiconvex  and  smooth,  and  could  easily  be  minimized 
by  a  properly  designed  quasi-Newton  algorithm. 
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Captions  for  figures 


Figure  1.1 


Figure  1.2 
Figure  1.3 

Figure  1.4 

Figure  1.5 

Figure  1.6 

Figure  1.7 


Plot  of  the  reduced  least  squares  objective 


«/optM  =  uginf  ^\\<f>[s,u]- z\ 


where 


z(x,  t )  =  sin(u>t) 


and  lj  =  50.0.  Here  —0.5  <  x  <  0.5, 0.0  <  t  <  1.0  and  the  values 
of  Jopt  are  computed  analytically  as  explained  in  the  text.  Since 
the  mean  square  residual  has  already  been  optimized  over  the 
waveform  u  here,  the  many  local  minima  of  Jopt  actually  repre¬ 
sent  local  minima  of  the  output  least-squares  objective  function. 


Same  as  1.1  with  ui  =  500.0. 


A  Ricker  wavelet  (second  derivitave  of  a  Gaussian  pulse)  with 
peak  frequency  20  Hz. 


Plane  wave  data  with  u  =  Ricker  waveform  of  Figure  1.3,  slope 
s  =  0.1. 


Same  as  1.4,  with  160%  added  pseudorandom  noise  filtered  with 
the  waveform  of  Figure  1.3. 


Jopt  for  the  noise-free  data  of  Figure  1.4. 


Jopt  for  the  noisy  data  of  Figure  1.5;  note  the  appearance  of  local 
optima. 


Figure  2.1  The  DSO  objective  function  for  the  noise- free  data  of  Figure  1.4, 
(7=1. 


Figure  2.2  The  DSO  objective  function  for  the  noisy  data  of  Figure  1.5, 
(7=1. 


Figure  2.3  Dataset  formed  by  summing  Figure  1.4  with  its  mirror  image; 

thus  composed  of  the  data  for  s  =  0.1  and  s  =  —0.1  in  equal 
measure.  Either  may  be  regarded  as  coherent  noise. 

Figure  2.4  The  DSO  objective  function  for  the  “coherent  noise”  data  of 
Figiure  2.3,  <7=1. 


Figure  2.5  Counterpart  of  Figure  2.1  with  <7  =  10. 


Figure  2.6  Counterpart  of  Figure  2.2  with  <7  =  10. 
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Plane  wave  data,  no  noise,  slope  =  0.1 
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Least  Squares  Scan,  160%  noise 
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Plane  wave  data,  100%  coherent  noise 
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DS  scan,  100%  coherent  noise 
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